Multimedia Tools and Applications
Exploring AI in Steganography and Steganalysis: Trends, Clusters, and Sustainable Development Potential
Sahu, Aditya Kumar, Kumar, Chandan, Kumar, Saksham, Solak, Serdar
Steganography and steganalysis are closely related subfields of information security. Over the past decade, many powerful and efficient artificial intelligence (AI)-driven techniques have been designed and presented in research on both steganography and steganalysis. This study presents a scientometric analysis of AI-driven steganography-based data hiding techniques using a thematic modelling approach. A total of 654 articles published between 2017 and 2023 have been considered. Evaluation of the study reveals that 69% of published articles are from Asian countries, with China on top (TP: 312), followed by India (TP: 114). The study identifies seven main thematic clusters: steganographic image data hiding, deep image steganalysis, neural watermark robustness, linguistic steganography models, speech steganalysis algorithms, covert communication networks, and video steganography techniques. The study also assesses the scope of AI steganography under the purview of the sustainable development goals (SDGs) to present the interdisciplinary reciprocity between them. Only 18 of the 654 articles are aligned with one of the SDGs, which shows that few studies have been conducted in alignment with SDG goals; SDG 9 (Industry, Innovation, and Infrastructure) leads among the 18 SDG-mapped articles. To the best of our knowledge, this is the first scientometric study of AI-driven steganography-based data hiding techniques. Beyond descriptive statistics, the study breaks down the underlying causes of the observed trends, including the influence of deep learning developments, trends in East Asia, and the maturity of foundational methods. The work also stresses critical gaps in societal alignment, particularly with the SDGs, ultimately unveiling the field's global impact on AI security challenges.
- Asia > India (0.24)
- Asia > East Asia (0.24)
- North America > United States (0.14)
- (17 more...)
- Overview (1.00)
- Research Report > New Finding (0.93)
Tracking daily paths in home contexts with RSSI fingerprinting based on UWB through deep learning models
Polo-Rodríguez, Aurora, Valera, Juan Carlos, Peral, Jesús, Gil, David, Medina-Quero, Javier
The field of human activity recognition has evolved significantly, driven largely by advancements in Internet of Things (IoT) device technology, particularly in personal devices. This study investigates the use of ultra-wideband (UWB) technology for tracking inhabitant paths in home environments using deep learning models. UWB technology estimates user locations via time-of-flight and time-difference-of-arrival methods, which are significantly affected by the presence of walls and obstacles in real environments, reducing their precision. To address these challenges, we propose a fingerprinting-based approach utilizing received signal strength indicator (RSSI) data collected from inhabitants in two flats (60 m² and 100 m²) while performing daily activities. We compare the performance of convolutional neural network (CNN), long short-term memory (LSTM), and hybrid CNN+LSTM models, as well as the use of Bluetooth technology. Additionally, we evaluate the impact of the type and duration of the temporal window (future, past, or a combination of both). Our results demonstrate a mean absolute error close to 50 cm, highlighting the superiority of the hybrid model in providing accurate location estimates, thus facilitating its application in daily human activity recognition in residential settings.
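As an illustration of the temporal-window evaluation described above, the sketch below builds combined past/future RSSI windows in NumPy. The function name `make_windows` and the toy stream are hypothetical; the paper's actual preprocessing is not specified in this abstract.

```python
import numpy as np

def make_windows(rssi, past=3, future=2):
    """Slice an RSSI stream of shape (T, n_anchors) into overlapping windows
    that combine `past` readings before and `future` readings after each step."""
    T = len(rssi)
    windows = []
    for t in range(past, T - future):
        windows.append(rssi[t - past : t + future + 1])
    # result shape: (T - past - future, past + future + 1, n_anchors)
    return np.stack(windows)

stream = np.arange(20, dtype=float).reshape(10, 2)  # 10 time steps, 2 anchors
w = make_windows(stream, past=3, future=2)
```

Each window could then be fed to a CNN, LSTM, or hybrid model as one training sample.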
- Europe > Spain > Valencian Community > Alicante Province > Alicante (0.04)
- North America > United States > New York (0.04)
- North America > Canada > Saskatchewan (0.04)
- (3 more...)
- Information Technology > Security & Privacy (0.67)
- Materials > Metals & Mining (0.67)
- Information Technology > Smart Houses & Appliances (0.67)
- Health & Medicine > Therapeutic Area (0.46)
Proposing a Semantic Movie Recommendation System Enhanced by ChatGPT's NLP Results
Fallahi, Ali, Bastanfard, Azam, Amini, Amineh, Saboohi, Hadi
The importance of recommender systems on the web has grown, especially in the movie industry, with a vast selection of options to watch. To assist users in traversing available items and finding relevant results, recommender systems analyze operational data and investigate users' tastes and habits. Providing highly individualized suggestions can boost user engagement and satisfaction, which is one of the fundamental goals of the movie industry, especially on online platforms. According to recent studies, using knowledge-based techniques and considering the semantic ideas of the textual data is a suitable way to obtain more appropriate results. This study provides a new method for building a knowledge graph based on semantic information. It uses ChatGPT, a large language model, to assess brief movie descriptions and extract their tone of voice. Results indicate that the proposed method can significantly enhance accuracy compared with employing the explicit genres supplied by the publishers.
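The abstract does not detail the graph construction, but a minimal sketch of a tone-based knowledge graph might look like the following; `build_tone_graph`, `recommend`, and the toy movie/tone data are all invented for illustration.

```python
from collections import defaultdict

def build_tone_graph(movies):
    """movies: {title: set of extracted tones}. Two movies are linked
    whenever they share at least one tone of voice."""
    by_tone = defaultdict(set)
    for title, tones in movies.items():
        for tone in tones:
            by_tone[tone].add(title)
    graph = defaultdict(set)
    for titles in by_tone.values():
        for t in titles:
            graph[t] |= titles - {t}
    return graph

def recommend(graph, movies, title, k=2):
    """Rank neighbours of `title` by the number of shared tones."""
    scores = {m: len(movies[title] & movies[m]) for m in graph[title]}
    return sorted(scores, key=lambda m: (-scores[m], m))[:k]

movies = {
    "A": {"dark", "tense"},
    "B": {"dark", "whimsical"},
    "C": {"tense", "dark"},
    "D": {"whimsical"},
}
graph = build_tone_graph(movies)
```

Ranking by shared tones rather than publisher genres is the design choice the abstract credits with the accuracy gain.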
- Asia > Middle East > Iran > Alborz Province > Karaj (0.04)
- North America > United States > Minnesota (0.04)
- Africa > Mauritania > Gorgol > Kaédi (0.04)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Determining the severity of Parkinson's disease in patients using a multi task neural network
García-Ordás, María Teresa, Benítez-Andrades, José Alberto, Aveleira-Mata, Jose, Alija-Pérez, José-Manuel, Benavides, Carmen
Parkinson's disease is easy to diagnose when it is advanced, but very difficult to diagnose in its early stages, and early diagnosis is essential to be able to treat the symptoms. The disease impacts daily activities and reduces the quality of life of both patients and their families, and it is the second most prevalent neurodegenerative disorder after Alzheimer's in people over the age of 60. Most current studies on predicting Parkinson's severity are carried out in advanced stages of the disease. In this work, we analyze a set of variables that can be easily extracted from voice analysis, making it a very non-intrusive technique. A method based on different deep learning techniques is proposed with two purposes: on the one hand, to find out whether a person has severe or non-severe Parkinson's disease, and on the other, to determine by means of regression techniques the degree of evolution of the disease in a given patient. The UPDRS (Unified Parkinson's Disease Rating Scale) has been used, taking into account both the motor and total labels, and the best results have been obtained using a mixed multi-layer perceptron (MLP) that classifies and regresses at the same time, with the most important features of the data, selected using an autoencoder, taken as input. A success rate of 99.15% has been achieved in predicting whether a person suffers from severe or non-severe Parkinson's disease. In the disease-involvement prediction problem, an MSE (mean squared error) of 0.15 has been obtained. Using a full deep learning pipeline for data preprocessing and classification has proven very promising in the Parkinson's field, outperforming state-of-the-art proposals.
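A mixed MLP that classifies and regresses at the same time can be sketched as a shared hidden layer with two heads and a weighted joint loss. The forward pass below is a minimal NumPy illustration with made-up weights and loss weighting, not the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def multitask_forward(x, W_shared, W_cls, W_reg):
    """Shared hidden layer feeding a 2-way classification head
    (severe vs. non-severe) and a scalar regression head (UPDRS score)."""
    h = np.tanh(x @ W_shared)                 # shared representation
    logits = h @ W_cls
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)         # softmax class probabilities
    y_reg = (h @ W_reg).ravel()               # regression output
    return p, y_reg

def joint_loss(p, y_reg, cls_true, reg_true, alpha=0.5):
    """Weighted sum of cross-entropy and mean squared error."""
    ce = -np.log(p[np.arange(len(cls_true)), cls_true]).mean()
    mse = ((y_reg - reg_true) ** 2).mean()
    return alpha * ce + (1.0 - alpha) * mse

x = rng.normal(size=(4, 5))                   # 4 patients, 5 voice features
W_shared = rng.normal(size=(5, 8))
W_cls = rng.normal(size=(8, 2))
W_reg = rng.normal(size=(8, 1))
p, y = multitask_forward(x, W_shared, W_cls, W_reg)
```

Training both heads against one joint loss is what lets the shared layer learn features useful for severity classification and UPDRS regression simultaneously.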
- Europe > Spain > Castile and León > León Province > León (0.04)
- North America > United States (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (1.00)
- Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)
Heart disease risk prediction using deep learning techniques with feature augmentation
García-Ordás, María Teresa, Bayón-Gutiérrez, Martín, Benavides, Carmen, Aveleira-Mata, Jose, Benítez-Andrades, José Alberto
Cardiovascular diseases stand as one of the greatest risks of death for the general population. Late detection of heart disease strongly conditions the chances of survival for patients. Age, sex, cholesterol level, sugar level, and heart rate, among other factors, are known to have an influence on life-threatening heart problems, but, due to the high number of variables, it is often difficult for an expert to evaluate each patient taking all this information into account. In this manuscript, the authors propose using deep learning methods combined with feature augmentation techniques to evaluate whether patients are at risk of suffering cardiovascular disease. The proposed methods outperform other state-of-the-art methods by 4.4%, leading to a precision of 90%, which is a significant improvement, even more so for an affliction that affects such a large population.
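The abstract does not specify which feature augmentation is used; one common tabular variant is to append squared and pairwise-product terms, sketched below with a hypothetical `augment_features` helper.

```python
import numpy as np

def augment_features(X):
    """Append squared terms and all pairwise products to the raw
    clinical features (age, cholesterol, heart rate, ...)."""
    n, d = X.shape
    squares = X ** 2
    pairs = [(X[:, i] * X[:, j])[:, None]
             for i in range(d) for j in range(i + 1, d)]
    return np.hstack([X, squares] + pairs)

X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
Xa = augment_features(X)   # 3 raw + 3 squared + 3 pairwise = 9 columns
```

The augmented matrix would then feed the deep network in place of the raw variables.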
- Europe > Switzerland (0.04)
- Europe > Spain > Castile and León > León Province > León (0.04)
- Asia > India (0.04)
- (3 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.68)
Machine learning based biomedical image processing for echocardiographic images
Heena, Ayesha, Biradar, Nagashettappa, Maroof, Najmuddin M., Bhatia, Surbhi, Agarwal, Rashmi, Prasad, Kanta
The popularity of artificial intelligence and machine learning has prompted researchers to use them in recent research. The proposed method uses the K-nearest neighbor (KNN) algorithm for segmentation of medical images, extracting image features for analysis and classifying the data with neural networks. Classification of images in medical imaging is very important, and KNN is a suitable algorithm: it is simple, conceptually clear, and computationally efficient, and it provides very good accuracy. KNN is a user-friendly approach with a wide range of uses among machine learning algorithms, commonly applied to image processing tasks including classification, segmentation, and regression. The proposed system uses gray level co-occurrence matrix (GLCM) features. The trained neural network has been tested successfully on a group of echocardiographic images, and errors were compared using a regression plot. The results of the algorithm were evaluated with various quantitative and qualitative metrics and shown to outperform current state-of-the-art methods in the related area on both. Regression analysis performed to assess the trained neural network showed a good correlation.
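The GLCM-plus-KNN pipeline described above can be sketched as follows; the reduction to contrast, energy, and homogeneity and the toy 4×4 textures are illustrative choices, not the paper's exact feature set.

```python
import numpy as np

def glcm_features(img, levels=4):
    """Horizontal-neighbour gray-level co-occurrence matrix, reduced to
    three classic texture statistics: contrast, energy, homogeneity."""
    g = np.zeros((levels, levels))
    for row in img:
        for a, b in zip(row[:-1], row[1:]):
            g[a, b] += 1
    g /= g.sum()
    i, j = np.indices(g.shape)
    contrast = (g * (i - j) ** 2).sum()
    energy = (g ** 2).sum()
    homogeneity = (g / (1.0 + np.abs(i - j))).sum()
    return np.array([contrast, energy, homogeneity])

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training feature vectors."""
    dist = np.linalg.norm(train_X - x, axis=1)
    nearest = train_y[np.argsort(dist)[:k]]
    return np.bincount(nearest).argmax()

# toy 4x4 textures: two flat "smooth" patches vs. two high-contrast ones
smooth = np.zeros((4, 4), dtype=int)
checker = np.tile([0, 3], (4, 2))
mid = np.tile([0, 2], (4, 2))
train_X = np.stack([glcm_features(smooth),
                    glcm_features(np.ones((4, 4), dtype=int)),
                    glcm_features(checker),
                    glcm_features(mid)])
train_y = np.array([0, 0, 1, 1])
```

Here texture statistics stand in for the echocardiographic features, and KNN votes on the class of an unseen patch.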
- Asia > India > Karnataka (0.05)
- Asia > Middle East > Israel (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (3 more...)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.56)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
MMFL-Net: Multi-scale and Multi-granularity Feature Learning for Cross-domain Fashion Retrieval
Bao, Chen, Zhang, Xudong, Chen, Jiazhou, Miao, Yongwei
Instance-level image retrieval in fashion is a challenging issue owing to its increasing importance in real-scenario visual fashion search. Cross-domain fashion retrieval aims to match unconstrained customer images, as queries, to photographs provided by retailers; however, it is a difficult task due to a wide range of consumer-to-shop (C2S) domain discrepancies and because clothing images are vulnerable to various non-rigid deformations. To this end, we propose a novel multi-scale and multi-granularity feature learning network (MMFL-Net), which can jointly learn global-local aggregation feature representations of clothing images in a unified framework, aiming to train a cross-domain model for C2S fashion visual similarity. First, a new semantic-spatial feature fusion part is designed to bridge the semantic-spatial gap by applying top-down and bottom-up bidirectional multi-scale feature fusion. Next, a multi-branch deep network architecture is introduced to capture global salient, part-informed, and local detailed information, extracting robust and discriminative feature embeddings by integrating the similarity learning of coarse-to-fine embeddings across multiple granularities. Finally, the improved trihard loss, center loss, and multi-task classification loss are adopted for our MMFL-Net, which can jointly optimize intra-class and inter-class distance and thus explicitly improve intra-class compactness and inter-class discriminability between its visual representations for feature learning. Furthermore, our proposed model also combines the multi-task attribute recognition and classification module with multi-label semantic attributes and product ID labels. Experimental results demonstrate that our proposed MMFL-Net achieves significant improvement over the state-of-the-art methods on two datasets, DeepFashion-C2S and Street2Shop.
Specifically, our approach exceeds the current best method by a large margin of +4.2% and +11.4% for mAP and Acc@1, respectively, on the most challenging dataset DeepFashion-C2S.
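The "improved trihard loss" itself is not given in the abstract; the sketch below implements the standard batch-hard triplet loss it builds on, with hypothetical embeddings.

```python
import numpy as np

def trihard_loss(emb, labels, margin=0.3):
    """Batch-hard triplet loss: for each anchor, take its hardest (farthest)
    positive and hardest (closest) negative, then apply a hinge with margin."""
    d = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)  # pairwise distances
    same = labels[:, None] == labels[None, :]
    n = len(emb)
    losses = []
    for i in range(n):
        hardest_pos = d[i][same[i] & (np.arange(n) != i)].max()
        hardest_neg = d[i][~same[i]].min()
        losses.append(max(0.0, hardest_pos - hardest_neg + margin))
    return float(np.mean(losses))

# two well-separated product-ID clusters -> zero loss at this margin
emb = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels = np.array([0, 0, 1, 1])
loss_separated = trihard_loss(emb, labels)
```

Driving this hinge to zero is what pulls same-product embeddings together and pushes different products apart, the intra-/inter-class objective the abstract describes.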
- Research Report > New Finding (0.48)
- Research Report > Promising Solution (0.34)
- Retail (0.48)
- Information Technology > Services (0.46)
A Novel Enhanced Convolution Neural Network with Extreme Learning Machine: Facial Emotional Recognition in Psychology Practices
Banskota, Nitesh, Alsadoon, Abeer, Prasad, P. W. C., Dawoud, Ahmed, Rashid, Tarik A., Alsadoon, Omar Hisham
Facial emotional recognition is one of the essential tools used in recognition psychology to diagnose patients. Face and facial emotion recognition are areas where machine learning is excelling. Facial emotion recognition in an unconstrained environment is an open challenge for digital image processing due to varying environments, such as lighting conditions, pose variation, yaw motion, and occlusions. Deep learning approaches have shown significant improvements in image recognition; however, accuracy and time still need improvement. This research aims to improve facial emotion recognition accuracy during the training session and reduce processing time using a modified Convolution Neural Network Enhanced with Extreme Learning Machine (CNNEELM). The system improves the accuracy of image registration during the training session and recognizes six facial emotions: happy, sad, disgust, fear, surprise, and neutral. The study shows that overall facial emotion recognition accuracy is improved by 2% over state-of-the-art solutions with a modified Stochastic Gradient Descent (SGD) technique. With the Extreme Learning Machine (ELM) classifier, processing time is brought down from 113 ms to 65 ms, which can smoothly classify each frame of a video clip at 20 fps. The proposed CNNEELM model, based on the pre-trained InceptionV3 model, is trained with the JAFFE, CK+, and FER2013 expression datasets. The simulation results show significant improvements in accuracy and processing time, making the model suitable for video analysis. In addition, the study solves the issue of the large processing time required to process facial images.
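An Extreme Learning Machine classifier, the component credited with the speed-up above, can be sketched in a few lines: a random, fixed hidden layer whose output weights are solved in closed form. The toy two-cluster data below is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def elm_fit(X, Y, n_hidden=30):
    """Extreme Learning Machine: a random, fixed hidden layer; only the
    output weights are learned, in closed form by least squares."""
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    beta = np.linalg.pinv(H) @ Y          # no iterative training needed
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# toy two-class problem with one-hot targets (invented data)
X = np.vstack([rng.normal(0.0, 0.3, size=(20, 2)),
               rng.normal(3.0, 0.3, size=(20, 2))])
Y = np.vstack([np.tile([1.0, 0.0], (20, 1)),
               np.tile([0.0, 1.0], (20, 1))])
W, b, beta = elm_fit(X, Y)
pred = elm_predict(X, W, b, beta).argmax(axis=1)
```

Because the only "training" is a pseudo-inverse solve, prediction reduces to two matrix products, which is why ELM heads are fast at inference time.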
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- (3 more...)
- Instructional Material (1.00)
- Research Report > New Finding (0.48)
Mixed Reality using Illumination-aware Gradient Mixing in Surgical Telepresence: Enhanced Multi-layer Visualization
Puri, Nirakar, Alsadoon, Abeer, Prasad, P. W. C., Alsalami, Nada, Rashid, Tarik A.
Background and aim: Surgical telepresence using augmented perception has been applied, but mixed reality is still under research and remains theoretical. The aim of this work is to improve visualization in the final merged video by producing globally consistent videos when the intensity of illumination in the input source and target videos varies. Methodology: The proposed system uses enhanced multi-layer visualization with illumination-aware gradient mixing, based on an illumination-aware video composition algorithm. A particle swarm optimization (PSO) algorithm is used, together with image pixel correlation, to find the best sample pair from the foreground and background regions and estimate the alpha matte; PSO helps recover the original colour and depth of unknown pixels in the unknown region. Result: Our results showed improved accuracy from reducing the mean squared error when selecting the best sample pair for the unknown region, across 10 samples each of bowel, jaw, and breast; the reduction is 16.48% relative to the state-of-the-art system. As a result, visibility accuracy improved from 89.4% to 97.7%, which helped keep the hand clearly visible even under differing light. Conclusion: The illumination effect and alpha-pixel correlation improve visualization accuracy, produce a globally consistent composition, and maintain temporal coherency when compositing two videos with high and inverse illumination effects. In addition, this paper provides a solution for selecting the best sampling pair for the unknown region to obtain the original colour and depth.
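The paper's PSO step searches for the best foreground/background sample pair; as a stand-in, the sketch below shows plain particle swarm optimization minimizing a simple quadratic cost. The hyperparameters and cost function are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(2)

def pso_minimize(cost, dim, n_particles=20, iters=60, w=0.7, c1=1.5, c2=1.5):
    """Plain particle swarm optimization: each particle remembers its
    personal best, and the swarm shares a global best that steers everyone."""
    pos = rng.uniform(-5.0, 5.0, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([cost(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([cost(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, float(pbest_val.min())

# stand-in cost: squared distance to a known optimum at (1, -2)
best, val = pso_minimize(lambda p: float(((p - np.array([1.0, -2.0])) ** 2).sum()), dim=2)
```

In the matting setting, the cost would instead score a candidate foreground/background pair by how well it explains the unknown pixel's colour.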
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Oklahoma > Beaver County (0.04)
- Asia > Middle East > Iraq > Kurdistan Region (0.04)
- Asia > Middle East > Iraq > Erbil Governorate > Erbil (0.04)
ReDMark: Framework for Residual Diffusion Watermarking on Deep Networks
Ahmadi, Mahdi, Norouzi, Alireza, Soroushmehr, S. M. Reza, Karimi, Nader, Najarian, Kayvan, Samavi, Shadrokh, Emami, Ali
This work has been submitted to the IEEE for possible publication. Due to the rapid growth of machine learning tools, and specifically deep networks, in various computer vision and image processing areas, applications of convolutional neural networks for watermarking have recently emerged. In this paper, we propose a deep end-to-end diffusion watermarking framework (ReDMark) which can be adapted for any desired transform space. The framework is composed of two fully convolutional neural networks with residual structure for embedding and extraction. The whole deep network is trained end-to-end to conduct blind, secure watermarking. The framework is customizable for the level of robustness vs. imperceptibility, and is also adjustable for the trade-off between capacity and robustness. The proposed framework simulates various attacks as a differentiable network layer to facilitate end-to-end training. For the JPEG attack, a differentiable approximation is utilized, which drastically improves the watermarking robustness to this attack. Another important characteristic of the proposed framework, which leads to improved security and robustness, is its capability to diffuse watermark data over a relatively wide area of the image. Comparative results versus recent state-of-the-art research highlight the superiority of the proposed framework in terms of imperceptibility and robustness. Digital watermarking was originally introduced in 1979 for anti-counterfeit purposes [1], to distinguish between original and counterfeit documents. Since then it has been applied to identify image ownership and protect intellectual property by hiding data such as logos and proprietary information in images, videos, and audio [2].
Another application is patient identification and medical-procedure matching by hiding patients' personal information. (Affiliations: Mahdi Ahmadi, Alireza Norouzi, and Nader Karimi (Member, IEEE) are with the Department of Electrical and Computer Engineering, Isfahan University of Technology, 84156-83111, Iran. S. M. Reza Soroushmehr (Member, IEEE) is with the Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, 48109, U.S.A.; email: ssoroush@med.umich.edu.)
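ReDMark's learned diffusion is not reproducible from the abstract alone; as a classical stand-in for diffusing watermark bits over a wide image area, the sketch below uses spread-spectrum embedding with pseudorandom spreading patterns and correlation-based extraction. All names and the strength value are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def embed(img, bits, strength=15.0):
    """Diffuse each watermark bit over the whole image with its own
    pseudorandom +/-1 spreading pattern (spread-spectrum embedding)."""
    patterns = rng.choice([-1.0, 1.0], size=(len(bits),) + img.shape)
    marked = img.astype(float).copy()
    for bit, pat in zip(bits, patterns):
        marked += strength * (1.0 if bit else -1.0) * pat
    return marked, patterns

def extract(marked, patterns):
    """Recover each bit by correlating the mean-removed image with its pattern."""
    centered = marked - marked.mean()
    return [(centered * pat).sum() > 0 for pat in patterns]

img = rng.integers(0, 256, size=(32, 32)).astype(float)
bits = [True, False, True, True]
marked, patterns = embed(img, bits)
recovered = extract(marked, patterns)
```

Spreading each bit over every pixel is what makes such schemes robust to local distortions, the same intuition behind ReDMark's diffusion property.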
- Asia > Middle East > Iran (0.24)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.24)
- Oceania > Australia > Queensland (0.04)
- North America > Canada > Ontario > Toronto (0.04)